Document Similarity Amid Automatically Detected Terms
Abstract
This is the second edition of the task formally known as Question Answering for the Spoken Web (QASW). It is an information retrieval evaluation in which the goal is to match spoken Gujarati “questions” to spoken Gujarati responses. This paper gives an overview of the task, covering its design and the development of the test collection, along with differences from previous years.
Similar resources
Playing with distances: Document Similarity
Spoken information retrieval is a promising domain of research. In this paper we describe our participation in the pilot Document Similarity Amid Automatically Detected Terms task of FIRE 2014. We present the findings of our experiments with variants of distance- and timestamp-based approaches. The de-normalized distance-based variant outperformed the other two, delivering the best results of the submitt...
Application of Localized Similarity for Web Documents
In this paper we present a novel approach to the automatic creation of anchor texts for hyperlinks in a document pointing to similar documents. The methods used in this approach rank parts of a document based on their similarity to a presumably related document. The ranks are then used to automatically construct the best anchor text for a link inside the original document to the compared document. A number of di...
Enhancement of Search Results Using Dynamic Document Seed Reranking Algorithm
We proposed an algorithm to improve the precision of top retrieved documents by reordering the retrieved documents in the initial retrieval. To re-order the documents, we first automatically extract key terms and key phrases from top N retrieved documents and generate a document index for each document. Using the standard similarity metrics, a document similarity matrix is generated for these d...
Contributions on Semantic Similarity and Its Applications to Data Privacy
Semantic similarity aims at quantifying the resemblance between the meanings of textual terms. Thus, it represents the cornerstone of textual understanding. Given the increasing availability and importance of textual sources within the current context of Information Societies, a lot of attention has been devoted in recent years to the development of mechanisms to automatically measure semantic simi...
Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms
This paper proposes an improved approach for spoken lecture summarization, in which random walk is performed on a graph constructed with automatically extracted key terms and probabilistic latent semantic analysis (PLSA). Each sentence of the document is represented as a node of the graph and the edge between two nodes is weighted by the topical similarity between the two sentences. The basic i...
Publication date: 2014